ABSTRACT

Minimum distance parameter estimation using weighted
Cramer-von Mises statistics is considered for the general
one-dimensional case. Under rather general conditions, the derived
estimators are asymptotically normal. Consideration is given to
appropriate weights to produce Fisher-efficient estimators. In
fact, estimators can be obtained with influence curves
proportional to any desired smooth function, and hence prescribed
first-order robustness properties. Many such curves (any
"redescending" influence curve) are shown to require weight
functions which are negative for some values.

1. INTRODUCTION AND NOTATION

We consider minimum distance (MD) estimation utilizing the
weighted Cramer-von Mises discrepancy. Letting
$\Gamma = \{F_\theta, \theta \in \Omega\}$ denote a parametrized family
of distribution functions (herein termed the model), and $G_n$ denote
the empirical distribution function based upon a random sample of
size n from some distribution function G, we write

$$\delta_{\psi_\theta}(G_n, F_\theta) = \int_{-\infty}^{\infty} (G_n - F_\theta)^2 \, \psi_\theta f_\theta \, d\mu , \qquad (1.1)$$


where $\mu$ denotes Lebesgue measure. The factor $\psi_\theta$ will be
referred to as the "weight function", and will typically (although, as
we shall see, not for a number of important cases of interest)
be nonnegative and possess certain smoothness properties.

In the context of robust estimation, although we hope
$G \in \Gamma$, i.e. $G = F_{\theta_0}$ for some $\theta_0 \in \Omega$,
$\Gamma$ is more realistically to be regarded as a model selected as
containing a reasonable approximation to G. Even in the cases where
$G \notin \Gamma$, minimization of $\delta_{\psi_\theta}(G_n, F_\theta)$
over all $\theta \in \Omega$ to obtain a MD-estimator typically
(under broad regularity conditions) results in a situation where
the estimand - that value for which the estimator is consistent -
has an intrinsic probabilistic meaning in that it is associated
with the best approximation (in a specified sense) to G in $\Gamma$.
(Parr and Schucany (1980) give further discussion of this point.)
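To make the discrepancy in (1.1) concrete, the following Python sketch approximates the integral of the squared difference between $G_n$ and a model distribution function, multiplied by a weight, using a midpoint rule. The function names, the integration limits, the step count and the quantile-based sample are illustrative assumptions and are not taken from the paper; the flat weight used here corresponds to an unweighted $L_2$ distance.

```python
from bisect import bisect_right
from statistics import NormalDist


def empirical_cdf(sample):
    """Return G_n, the empirical distribution function of the sample."""
    xs = sorted(sample)
    n = len(xs)
    return lambda x: bisect_right(xs, x) / n


def cvm_discrepancy(sample, model_cdf, integrand_weight, lo, hi, steps=4000):
    """Midpoint-rule approximation of the integral over [lo, hi] of
    (G_n(x) - F(x))**2 * w(x) dx, where w is the full weight that
    multiplies the squared difference (hypothetical helper)."""
    g_n = empirical_cdf(sample)
    width = (hi - lo) / steps
    total = 0.0
    for j in range(steps):
        x = lo + (j + 0.5) * width
        total += (g_n(x) - model_cdf(x)) ** 2 * integrand_weight(x)
    return total * width


# Illustrative sample: normal quantiles, so G_n is close to the N(0, 1) cdf.
n = 20
std = NormalDist()
sample = [std.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

flat = lambda x: 1.0  # assumed flat weight: an unweighted L2 distance
d_true = cvm_discrepancy(sample, NormalDist(0.0, 1.0).cdf, flat, -8.0, 8.0)
d_shifted = cvm_discrepancy(sample, NormalDist(1.0, 1.0).cdf, flat, -8.0, 8.0)
print(d_true, d_shifted)
```

In this sketch the discrepancy to the correctly centered model is much smaller than the discrepancy to the model shifted by one unit, which is the behavior a minimum distance estimator exploits.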

We shall formally define the MD-estimator based on $G_n$ with
respect to $\Gamma$ and $\delta_{\psi_\theta}$ as "the" solution of

Thus, $\lambda_{G_n}(T(G_n)) = 0$ defines our estimator $T(G_n)$ (we
assume a unique method of choosing a consistent solution - by a local
convexity argument, all consistent solutions will be
$\sqrt{n}$-equivalent), and our estimand is a solution of
$\lambda_G(T(G)) = 0$. In most of the typical contexts considered,
another $\sqrt{n}$-consistent estimator $\hat\theta$ of T(G) will
exist, and we may simply choose the solution of (1.2) closest to
$\hat\theta$.

The influence curve (see Hampel (1974)) of the MD-estimator
obtained as a root of (1.2), or equivalently by minimizing (1.1),
may be seen by straightforward calculation to be (when $G = F_\theta$)

$-\infty < c < \infty$ ($\delta_c(x) = I_{[c,\infty)}(x)$), for all $\theta \in \Omega$.

In Section 2 we examine efficient estimation for one-parameter
problems, with the normal, double-exponential and t-distributions
as examples for the location problem, and the normal
distribution for the scale problem. Extensions to multi-parameter
situations are discussed. Section 3 consists of a series
of comments on the robustness of estimators derived in this fashion,
and exhibits asymptotic equivalents for some familiar "robust"
estimators. Section 4 gives arguments for preferring MD estimators
to other locally asymptotically equivalent, but perhaps
computationally simpler, estimators. Section 5 discusses some
suggestions regarding extension of the work in the preceding sections
as well as its numerical implementation. While the discussion in
these first five sections has been largely intuitive, avoiding
technical details and regularity conditions, a proof of the main
result is given in the appendix.
2. EFFICIENT ESTIMATION IN ONE-PARAMETER PROBLEMS

For many estimation problems there exists an expansion of
the form

$$T[G_n] = T[F_\theta] + \frac{1}{n} \sum_{i=1}^{n} IC_{T,F_\theta}(X_i) + o_p\!\left(\frac{1}{\sqrt{n}}\right) \quad \text{as } n \to \infty . \qquad (2.1)$$


This justifies (asymptotically) the usual interpretations of
the influence curve, and tells us that
$\sqrt{n}(T[G_n] - T[F_\theta]) \xrightarrow{d} N(0, E_{F_\theta}[IC^2_{T,F_\theta}(X)])$.
For sufficiently "regular" estimators, a stronger expansion

$$T[H] = T[F_\theta] + \int IC_{T,F_\theta}(x) \, dH(x) + o(\|H - F_\theta\|) \quad \text{as } \|H - F_\theta\| \to 0 \qquad (2.2)$$

holds, where $\|\cdot\|$ is a norm on the space of distribution
functions such that $\|G_n - G\| = O_p(1/\sqrt{n})$ as $n \to \infty$,
where $G_n$ and G are as in Section 1 (see Huber (1977) or Boos and
Serfling (1980)). From this, we can see that an estimator which is
efficient in the Fisher-Rao sense will have
Equating the right-hand sides of (1.3) and (2.3) and
differentiating with respect to c, we find that when
$IC_{T,F_\theta}(c)$ is continuous in c

yields in general $IC_{T,F_\theta}(c)$ proportional to the expression
in (2.3), and in particular for location or scale models, the attained
$IC_{T,F_\theta}(c)$ is exactly that of (2.3). Boos (1980) also gives
(2.4) for the location case. Whether this holds for parameters of
non-location or scale type must be determined by inspection in each
case. Thus, except in special cases, full efficiency will not be
consistent with the restriction, typical for goodness of fit
applications, that $\psi_\theta(c) = \psi^*(F_\theta(c))$. Note that
$\psi_\theta(\cdot)$ will vary with $\theta$ in the minimization.

Thus, minimization of $\delta_{\psi_\theta}(G_n, F_\theta)$ with
respect to $\theta$ should in many cases produce asymptotically fully
efficient MD-estimators of $\theta$, subject to the regularity
conditions outlined in the appendix. For location estimation
problems, (2.4) reduces to

$$\psi_\theta(c) = -\frac{\partial^2 \log f_\theta(c) / \partial c^2}{f_\theta^2(c)} . \qquad (2.5)$$


Hence, for the problem of estimation of the location parameter
of a normal population ($\sigma$ known and taken to be equal to 1
without loss of generality), the weight function for efficient
estimation is found to be

$$\psi_\theta(c) = 1/f^2(c - \theta) = 2\pi e^{(c-\theta)^2} . \qquad (2.6)$$

This yields $IC_{T,F_\theta}(c) = c - \theta$, $-\infty < c < \infty$,
and an asymptotic equivalent at the normal parent for the sample mean.
Using the technique of De Wet (1980) it can be shown that the weighted
Cramer-von Mises statistic with weight function (2.6) has the
maximum approximate Bahadur slope in testing for the unit normal
against shift alternatives. This "coincidence" is natural in the
light of the results of Hodges and Lehmann (1963), with the new
twist that in the present case the inverted test is not
asymptotically normal.
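The claim that the weight $1/f^2(c - \theta)$ leads to an equivalent of the sample mean at the normal model can be explored numerically. The sketch below is only an illustration under stated assumptions: it takes the discrepancy to be the integral of $(G_n - F_\theta)^2 \psi_\theta f_\theta$, so that the effective weight is $1/f(x - \theta)$, approximates the integral by a midpoint rule, and minimizes over a grid of candidate values. The function name, grid, integration limits and quantile-based sample are hypothetical choices, not the authors' procedure.

```python
import math
from bisect import bisect_right
from statistics import NormalDist, mean

STD = NormalDist()


def discrepancy(theta, xs, lo, hi, steps=2000):
    """Midpoint rule for the integral of (G_n - F_theta)^2 * psi_theta * f_theta,
    with psi_theta = 1 / f_theta^2, so that psi_theta * f_theta = 1 / f_theta.
    xs must be sorted; f is the unit normal density (assumption)."""
    n = len(xs)
    width = (hi - lo) / steps
    total = 0.0
    for j in range(steps):
        x = lo + (j + 0.5) * width
        g_n = bisect_right(xs, x) / n
        diff = g_n - STD.cdf(x - theta)
        inv_density = math.sqrt(2.0 * math.pi) * math.exp(0.5 * (x - theta) ** 2)
        total += diff * diff * inv_density
    return total * width


# Illustrative sample: N(1, 1) quantiles (symmetric about 1 by construction).
n = 20
xs = sorted(1.0 + STD.inv_cdf((i - 0.5) / n) for i in range(1, n + 1))
lo, hi = xs[0] - 5.0, xs[-1] + 5.0

grid = [k / 100.0 for k in range(0, 201)]  # candidate theta values in [0, 2]
theta_md = min(grid, key=lambda t: discrepancy(t, xs, lo, hi))
print(theta_md, mean(xs))
```

A grid search is used here purely for transparency; it is slow and would normally be replaced by an iterative root-finder.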

The double exponential model provides an interesting case
in which the ideal weight function is a point mass at 0, i.e.
$\psi^*(x - \theta)$ is such that the integral

$$\int_A \psi^*(x - \theta) \, dx = I_A(\theta) , \qquad (2.7)$$

where $I_A(\cdot)$ is the indicator function. Note that this result
does not follow from (2.5) but from the fact that the minimum
resulting from use of (2.7) is precisely the sample median, when
$F_\theta$ is strictly increasing at its median $\theta$.

The t-distributions constitute a broad class ranging from
the Cauchy to the normal. Here, the efficient weight function
is given (for k degrees of freedom) by

$$\psi^*(x - \theta) = (k - (x - \theta)^2)(k + (x - \theta)^2)^{k-1} . \qquad (2.8)$$

Note that this weight function gives negative weight to extreme
values of $x - \theta$, i.e. for $|x - \theta| > \sqrt{k}$. This
corresponds in fact to the following basic principle. Since

$$\psi_\theta(x) = \frac{\partial IC_{T,F_\theta}(x) / \partial x}{f_\theta^2(x)} \qquad (2.9)$$

is the weight function designed to result in an estimator with
influence curve $IC_{T,F}(c)$ in the location problem, a redescending
influence curve can only be achieved by use of a weight function
which is negative for some values. This may well serve to bolster
the misgivings some feel regarding these redescenders - they are
asymptotically equivalent to MD-estimators based on Cramer-von
Mises discrepancies with weights which are not nonnegative!
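The relation between an influence curve and its weight function in the location problem can be checked numerically for the t-distribution. The sketch below is a hedged illustration: it takes the location score $-\partial \log f / \partial x$ as the target influence curve (up to a constant), forms the weight as its derivative divided by $f^2$, and compares the result with the closed form $(k - x^2)(k + x^2)^{k-1}$. The function names, the central-difference step and the evaluation points are assumptions made for this example.

```python
import math


def t_density(x, k):
    """Student t density with k degrees of freedom."""
    c = math.gamma((k + 1) / 2) / (math.sqrt(k * math.pi) * math.gamma(k / 2))
    return c * (1.0 + x * x / k) ** (-(k + 1) / 2)


def t_score(x, k):
    """-d/dx log f(x): the efficient location influence curve up to a constant."""
    return (k + 1) * x / (k + x * x)


def weight_from_influence(ic, density, x, h=1e-5):
    """Weight psi(x) = IC'(x) / f(x)^2, with IC' by central difference."""
    slope = (ic(x + h) - ic(x - h)) / (2.0 * h)
    return slope / density(x) ** 2


def t_weight_closed_form(x, k):
    """(k - x^2)(k + x^2)^(k - 1), the closed form quoted in the text."""
    return (k - x * x) * (k + x * x) ** (k - 1)


k = 3
ic = lambda u: t_score(u, k)
f = lambda u: t_density(u, k)
for x in (0.5, 1.0, 2.0, 3.0):
    psi = weight_from_influence(ic, f, x)
    # The ratio to the closed form should be the same constant at every x.
    print(x, psi, psi / t_weight_closed_form(x, k))
```

The printed ratio is constant across evaluation points, and the weight changes sign at $|x| = \sqrt{k}$, where the score function starts to redescend.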

Curiously, the optimal weight function for estimation of
the standard deviation of a normal population with mean known and
equal to 0 (without loss of generality) turns out to be

(2.10)

similar to the optimal weighting for location. This strongly
suggests that simultaneous minimization of

with respect to $\mu$ and $\sigma$ will yield efficient estimators of
$\mu$ and $\sigma$ in the normal location and scale problem, where

Note that we would actually define our estimators as the joint
roots of the simultaneous equations

$$\frac{\partial \delta_\psi(G_n, F_{\mu,\sigma})}{\partial \mu} = 0 \qquad (2.13)$$

and
A last intriguing example for location estimation is the
logistic family (scale known), where it follows that the optimal
weight function is that of the Anderson-Darling statistic,

$$\psi_\theta(x) = [F_\theta(x)(1 - F_\theta(x))]^{-1} .$$

For the simple exponential, $F(x, \theta) = 1 - e^{-x/\theta}$,
$x > 0$, and the optimal weight is

(2.14)


3. ROBUST WEIGHTED CRAMER-VON MISES ESTIMATION


However, the ability to generate asymptotically efficient
MD-estimators for a given parametric model would in itself be of
little value, given the existence of asymptotically efficient
L-estimators for location/scale problems (see Chernoff, Gastwirth
and Johns (1967)) and the option of using maximum likelihood
estimation for non-location/scale problems. Equation (1.3) gives
the key to a more significant application, however. Differentiation
of both sides of (1.3) with respect to c yields

$$\psi_\theta(c) \propto -\frac{\partial IC_{T,F_\theta}(c) / \partial c}{f_\theta(c) \, \partial F_\theta(c) / \partial \theta} \qquad (3.1)$$

as the weight function designed to give the associated estimators
a specified (differentiable) influence curve $IC_{T,F_\theta}(\cdot)$.
Our intuitive desire for a bounded weight function $\psi_\theta(\cdot)$
thus corresponds to requiring $IC_{T,F_\theta}(c)$ to be "extremely
stable" for c in regions where
$f_\theta(c) \, \partial F_\theta(c) / \partial \theta$ is small
(typically $c \to \pm\infty$).

Thus we may, as is the case with M-estimation, obtain
estimators with influence curve proportional to any desired function
(up to regularity conditions which will typically be satisfied for
cases with $IC_{T,F_\theta}(\cdot)$ bounded and "smooth"). There is
the further benefit that, as opposed to M-estimation, when the model
$\Gamma$ is not true, but only contains a reasonable approximation to
G, the value $\theta_0$ for which the MD-estimator is consistent
possesses an interpretable probabilistic meaning as discussed below
in Section 4.

For the normal location problem, where the optimal weight
function is

$$\psi_\theta(c) = 1/f^2(c - \theta) = O\!\left([u(1 - u)]^{-2} [\log(u(1 - u))]^{-1}\right) , \qquad (3.2)$$

as $c \to \pm\infty$, with $u = F(c - \theta)$, a natural modification
to avoid the unbounded weight function is

$$\psi_\theta(c) = \begin{cases} 1/f^2(c - \theta), & |c - \theta| \le k \\ 0, & |c - \theta| > k \end{cases} \qquad (3.3)$$

for some fixed k > 0. This weight function "trims" the region
of integration in the discrepancy $\delta_{\psi_\theta}(G_n, F_\theta)$,
and in fact yields a local almost sure $\sqrt{n}$-equivalent of the
trimmed mean with trimming proportion $1 - \Phi(k)$, where $\Phi$ is
the unit normal cumulative. The local equivalence in fact suggests
usage of the trimmed mean as a strategic starting value for the
iterative procedure for computation of this MD-estimator.
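A small Python sketch of the trimmed weight may help fix ideas. It assumes a unit normal density, treats the boundary $|c - \theta| = k$ as inside the retained region (the text does not settle the boundary case), and uses hypothetical function names. It also evaluates the trimming proportion $1 - \Phi(k)$ quoted above for an illustrative cutoff.

```python
import math
from statistics import NormalDist

STD = NormalDist()


def trimmed_normal_weight(c, theta, k):
    """1 / f(c - theta)^2 inside |c - theta| <= k, zero outside (unit normal f)."""
    z = c - theta
    if abs(z) > k:
        return 0.0
    return 2.0 * math.pi * math.exp(z * z)


def trimming_proportion(k):
    """Proportion 1 - Phi(k) associated with cutoff k."""
    return 1.0 - STD.cdf(k)


cutoff = 1.645  # illustrative cutoff, not a value from the paper
print(trimmed_normal_weight(0.0, 0.0, cutoff))  # weight at the center
print(trimmed_normal_weight(2.0, 0.0, cutoff))  # beyond the cutoff
print(trimming_proportion(cutoff))
```

Because the weight vanishes outside the cutoff, observations far from the current value of $\theta$ contribute nothing to the weighted part of the integrand, which is the sense in which the region of integration is trimmed.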

Equation (3.1) suggests equivalents to other well-known
"robust" estimators. Note, however, that the asymptotic
$\sqrt{n}$-equivalences hold in general only at the model - the
estimators will have different behavior away from the strict
parametric model. A local asymptotic equivalent to a general
M-estimator of location $\hat\theta_n$ is provided (for any fixed
density f) by taking

0 otherwise, (3.4)

for some constant s > 0, where the M-estimator $\hat\theta_n$ is
defined by

(3.5)


4. MINIMUM CVM NORM ESTIMATION AND OTHER METHODS

It may well be asked why one should use a minimum CVM
norm estimator when other (computationally simpler) methods
exist which are asymptotically equivalent when the model $\Gamma$ is
the correct one, that is, when $G \in \Gamma$. Section 5 deals briefly
with the issue of computational simplicity. Reasons for
preferring minimum CVM estimators include i) a concrete
probabilistic interpretation for the estimator when
$G \notin \Gamma$, a property not shared by the other methods;
ii) the greater ease of applying minimum CVM norm estimators to
complex problems not involving artificial symmetries, and their
robustness properties;
iii) desirable properties of minimum CVM norm estimators as
indicated by Millar (1979); and iv) the extremely competitive
small sample behavior of minimum CVM norm estimators as shown
by Parr and Schucany (1980) in the location problem.

To understand the benefit of minimum CVM norm estimation
as giving answers with concrete probabilistic interpretations,
consider the case $G \notin \Gamma$. Here, under suitable regularity
conditions, the estimator converges to the value
$\theta_0 \in \Omega$ such that

$$\delta_{\psi_{\theta_0}}(G, F_{\theta_0}) = \inf_{\theta \in \Omega} \delta_{\psi_\theta}(G, F_\theta) .$$

For instance, if $\psi_\theta = 1/f_\theta$, $\theta_0$ gives a best
$L_2$ approximation to G among the distribution functions
$F_\theta \in \Gamma$. More generally, $\delta_{\psi_\theta}$
defines a notion of the "distance" measured in probability units
between two distribution functions, and $\theta_0$ minimizes that
distance. Hence if $\delta_{\psi_\theta}$ is well chosen a reasonable
approximation to G is obtained. Neither M nor L estimators have this
property. M-estimators converge (under suitable regularity
conditions) to a solution of a linear equation in G having no
necessary intrinsic meaning when $G \notin \Gamma$.


In many complex problems, the CVM based estimators are
easier to apply than M or L estimators. The relative scarcity
of L estimators proposed for non-location or scale problems
which are robust and interpretable when $G \notin \Gamma$ serves to
illustrate this point. Application of M-estimation to such
problems involves the extremely complicated solution of the
equations of Huber (1977, p.33). For nonsymmetric or non-additive
errors, this is an unsolved problem for practical applications.

By way of contrast, robustness against gross errors of a CVM
based estimator can usually be achieved by keeping $\psi_\theta$ small
(that is, $\int \psi_\theta f_\theta \, d\mu < c < \infty$ for all
$\theta \in \Omega$), while still preserving consistency of the
derived estimator when $G \in \Gamma$.

Millar (1979) indicates a number of desirable features
of minimum CVM norm estimators in a precisely specified
minimax sense against sequences of alternatives approaching the model
$\Gamma$. This further bolsters the intuitive notion that the
estimators should have good properties when the model is not exactly
true, but still contains a reasonable approximation to G.

Parr and Schucany (1980) report partial results from an
extensive Monte Carlo study comparing minimum CVM norm estimators
to the best of the M and L estimators for location examined in
the Princeton Study (Andrews, et al. (1972)). They find the CVM
based estimators to be highly competitive with these others,
which were chosen for inclusion because of their previously
documented excellent behavior at the distributions examined in
this later comparative study.

5.  COMPUTATIONAL  MATTERS,  EXTENSIONS  AND  CONCLUSIONS 


Unfortunately, for a variety of possible weight functions,
the discrepancy $\delta_{\psi_\theta}(G_n, F_\theta)$ does not admit a
simple calculating

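form. However, the alternative version

$$\delta^*_{\psi_\theta}(G_n, F_\theta) = \frac{1}{n} \sum_{i=1}^{n} \left( F_\theta(X_{(i)}) - \frac{i}{n+1} \right)^2 \psi_\theta\!\left( F_\theta^{-1}\!\left( \frac{i}{n+1} \right) \right) \qquad (5.1)$$

(where $X_{(i)}$ is the i-th order statistic) which yields an estimator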
with the same asymptotic distribution as the one obtained by
inverting $\delta_{\psi_\theta}(G_n, F_\theta)$, is much easier to
calculate when $\delta_{\psi_\theta}(G_n, F_\theta)$ does not integrate
in closed form. In fact, the derivatives of
$\delta^*_{\psi_\theta}(G_n, F_\theta)$ generally possess simple forms,
permitting the use of a Newton-Raphson routine for computation of the
estimator. Thus, the estimators are essentially no more costly to
compute than the usual M-estimators. (See De Wet and Venter (1973)
where this statistic is used in a goodness-of-fit setting.)

Alternatively, in the location case, where for reasons of
invariance $\psi_\theta(F_\theta^{-1}(\frac{i}{n+1})) = h(\frac{i}{n+1})$,
a function free of $\theta$, weighted nonlinear least squares
techniques can be employed to minimize
$\delta^*_{\psi_\theta}(G_n, F_\theta)$. In experimentation involving
this second method, the first author has used standard nonlinear
regression packages to compute the MD-estimator and obtained
convergence to sufficient accuracy for applications typically in
less than five iterations.
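The weighted nonlinear least squares idea for the location case can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' software: the model is the unit normal location family, the fixed weights are taken as $h(u) = 1/f^2(F^{-1}(u))$ with $u_i = i/(n+1)$, and a plain Gauss-Newton iteration stands in for the nonlinear regression packages mentioned above. The function name, starting value, tolerance and quantile-based sample are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()


def md_location_wls(sample, theta_start, max_iter=20, tol=1e-10):
    """Gauss-Newton iterations for minimizing
    sum_i h(u_i) * (F(x_(i) - theta) - u_i)^2, u_i = i / (n + 1),
    with F the unit normal cdf and h(u) = 1 / f(F^{-1}(u))^2."""
    xs = sorted(sample)
    n = len(xs)
    us = [i / (n + 1) for i in range(1, n + 1)]
    hs = [1.0 / STD.pdf(STD.inv_cdf(u)) ** 2 for u in us]
    theta = theta_start
    for iteration in range(1, max_iter + 1):
        num = 0.0
        den = 0.0
        for x, u, h in zip(xs, us, hs):
            resid = STD.cdf(x - theta) - u
            jac = -STD.pdf(x - theta)  # derivative of resid with respect to theta
            num += h * jac * resid
            den += h * jac * jac
        step = -num / den
        theta += step
        if abs(step) < tol:
            return theta, iteration
    return theta, max_iter


# Illustrative sample: unit normal quantiles shifted to be centered at 1.
n = 50
sample = [1.0 + STD.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]
theta_hat, iterations = md_location_wls(sample, theta_start=1.3)
print(theta_hat, iterations)
```

For this artificial sample the residuals vanish at the center of the data, so the iteration settles there; with real data a robust starting value such as a trimmed mean or median would be a natural choice.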


Although (2.10) gave a single weight function which yielded
jointly optimal estimators for the location and scale of a normal
distribution, a single such weight function does not exist for
multiparameter problems in general. For a p-dimensional
parameter $\theta' = (\theta_1, \theta_2, \ldots, \theta_p)$, jointly
optimal estimators are however available as the joint roots in
$\theta_1, \theta_2, \ldots, \theta_p$ of

$$\frac{\partial \delta_{\psi_{\theta,i}}(G_n, F_\theta)}{\partial \theta_i} = 0, \quad i = 1, 2, \ldots, p, \qquad (5.2)$$

where

$$\psi_{\theta,i}(c) = \frac{-\partial^2 \log f_\theta(c) / \partial \theta_i \partial c}{f_\theta(c) \, \partial F_\theta(c) / \partial \theta_i}, \quad i = 1, \ldots, p. \qquad (5.3)$$

Such a procedure would be necessary if optimality of the vector
estimator $\hat\theta_n$ were important. It will be noted that
equations (5.2) correspond to (in fact under general conditions are
asymptotically equivalent to) the likelihood equations.

6.  APPENDIX 


In this section we give conditions for the MD-estimator
based upon $\delta_{\psi_\theta}(G_n, F_\theta)$ to be asymptotically
normal (and Fisher-Rao efficient with proper choice of $\psi_\theta$,
although our result allows other weightings). We assume that strong
consistency of the selected root of (1.2) is guaranteed by some
result such as Theorem 2 of Parr and Schucany (1980), or one of its
modifications stated there. We take $\theta_0 = 0$ without loss of
generality, write $F_{\theta_0} = F = G$, $f_{\theta_0} = f$, and define

$$h(t) = \begin{cases} \lambda_F(t)/t, & t \ne 0 \\ \lambda_F'(T(F)), & t = 0 \end{cases} \qquad (6.1)$$

(Note $\lambda_F(T(F)) = T(F) = 0$, and hence h(t) is just the
difference quotient when $t \ne T[F]$. Thus, h(t) is continuous at
t = T[F] by differentiability of $\lambda_F(c)$ at T[F].)

It is sufficient for asymptotic normality that $IC_{T,F}(c)$ as in
(1.3) be square integrable with respect to F, and

$$T[G_n] - T[G] - H(G_n) \int IC_{T,F}(c) \, dG_n(c) = o_p\!\left(\frac{1}{\sqrt{n}}\right) \qquad (6.2)$$

where $H(G_n) \to 1$ w.p.1. We take

$$H(G_n) = \frac{\lambda_G'(T[G])}{h(T[G_n])} \qquad (6.3)$$


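and then

$$\left| T[G_n] - T[F] - H(G_n) \int IC_{T,F}(c) \, dG_n(c) \right| = \frac{1}{h(T[G_n])} \left| 2 \int (G_n - F) \left\{ \left. \frac{\partial F_\theta}{\partial \theta} \psi_\theta f_\theta \right|_{\theta = T[G_n]} - \left. \frac{\partial F_\theta}{\partial \theta} \psi_\theta f_\theta \right|_{\theta = 0} \right\} d\mu + \int (F - G_n)(F + G_n - 2F_\theta) \left. \frac{\partial}{\partial \theta} [\psi_\theta f_\theta] \right|_{\theta = T[G_n]} d\mu \right| \qquad (6.4)$$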


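after some elementary algebraic operations. Since
$h(T[G_n]) \to \lambda_F'(T(F)) > 0$ w.p.1 by assumption, we need
merely show each of the terms in the second factor to be
$o_p(1/\sqrt{n})$. For the first term it is clear that

$$\left| 2 \int (G_n - F) \left\{ \left. \frac{\partial F_\theta}{\partial \theta} \psi_\theta f_\theta \right|_{\theta = T[G_n]} - \left. \frac{\partial F_\theta}{\partial \theta} \psi_\theta f_\theta \right|_{\theta = 0} \right\} d\mu \right| \le 2 \sup \frac{|G_n - F|}{\{F(1 - F)\}^{1/2 - \varepsilon}} \int \{F(1 - F)\}^{1/2 - \varepsilon} \left| \left. \frac{\partial F_\theta}{\partial \theta} \psi_\theta f_\theta \right|_{\theta = T[G_n]} - \left. \frac{\partial F_\theta}{\partial \theta} \psi_\theta f_\theta \right|_{\theta = 0} \right| d\mu . \qquad (6.5)$$

Since $\sup \frac{|G_n - F|}{\{F(1 - F)\}^{1/2 - \varepsilon}} = O_p(1/\sqrt{n})$
for $0 < \varepsilon < 1/2$ and $T[G_n]$ is strongly consistent, it
follows that a sufficient condition for the first term to be of
proper order is

$$\int \{F(1 - F)\}^{1/2 - \varepsilon} \left| \left. \frac{\partial F_\theta}{\partial \theta} \psi_\theta f_\theta \right|_{\theta = c} - \left. \frac{\partial F_\theta}{\partial \theta} \psi_\theta f_\theta \right|_{\theta = 0} \right| d\mu \to 0 , \qquad (6.6)$$

as $c \to 0$. (For specific cases such as location estimation, this
simplifies considerably.)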

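For the second term in the sum on the RHS of (6.4), it
suffices to show that both

$$\int (F - G_n)^2 \left. \frac{\partial}{\partial \theta} [\psi_\theta f_\theta] \right|_{\theta = T[G_n]} d\mu = o_p\!\left(\frac{1}{\sqrt{n}}\right) \qquad (6.7)$$

and

$$2 \int (F - G_n)(G - F_\theta) \left. \frac{\partial}{\partial \theta} [\psi_\theta f_\theta] \right|_{\theta = T[G_n]} d\mu = o_p\!\left(\frac{1}{\sqrt{n}}\right) . \qquad (6.8)$$

The first of these [viz (6.7)] is bounded (for $0 < \delta < 1$) in
absolute value by

$$\sup \frac{|G_n - F|^2}{\{F(1 - F)\}^{1 - \delta}} \int \{F(1 - F)\}^{1 - \delta} \left| \left. \frac{\partial}{\partial \theta} [\psi_\theta f_\theta] \right|_{\theta = T[G_n]} \right| d\mu \qquad (6.9)$$

and the second (6.8) (for $0 < \varepsilon < 1/2$) by

$$2 \sup \frac{|G_n - F|}{\{F(1 - F)\}^{1/2 - \varepsilon}} \int \{F(1 - F)\}^{1/2 - \varepsilon} \, |F - F_\theta| \left| \left. \frac{\partial}{\partial \theta} [\psi_\theta f_\theta] \right|_{\theta = T[G_n]} \right| d\mu . \qquad (6.10)$$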
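Thus if

$$\int \{F(1 - F)\}^{1 - \delta} \left| \frac{\partial}{\partial \theta} [\psi_\theta f_\theta] \right| d\mu < \infty$$

for $\theta$ in some neighborhood of 0, and

$$\lim_{\theta \to 0} \int \{F(1 - F)\}^{1/2 - \varepsilon} \, |F - F_\theta| \left| \frac{\partial}{\partial \theta} [\psi_\theta f_\theta] \right| d\mu = 0 ,$$

these terms are also $o_p(1/\sqrt{n})$. Collecting our conditions,
with all notation as before, we thus have the following.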

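Theorem. If $\{T[G_n]\}_{n=1}^{\infty}$ is a sequence of MD-estimators
based upon the sequence $\{G_n\}_{n=1}^{\infty}$, with respect to
$\Gamma$ and $\delta_{\psi_\theta}$, chosen as a solution of (1.2), and

i) $T[G_n]$ is strongly consistent,

ii) $G = F_\theta$ for some $\theta \in \Omega$ (taken without loss of
generality to be $\theta = 0$),

iii) $\lambda_F(c)$ is differentiable at T[F] and
$\lambda_F'(T(F)) > 0$,

iv) $0 < \int IC^2_{T,F}(c) \, dF(c) = \sigma^2 < \infty$, and

v) $\lim_{c \to 0} \int \{F(1 - F)\}^{1/2 - \varepsilon} \left| \left. \frac{\partial F_\theta}{\partial \theta} \psi_\theta f_\theta \right|_{\theta = c} - \left. \frac{\partial F_\theta}{\partial \theta} \psi_\theta f_\theta \right|_{\theta = 0} \right| d\mu = 0$,
$\int \{F(1 - F)\}^{1 - \delta} \left| \frac{\partial}{\partial \theta} [\psi_\theta f_\theta] \right| d\mu < \infty$
for $\theta$ in some neighborhood of 0, and
$\lim_{\theta \to 0} \int \{F(1 - F)\}^{1/2 - \varepsilon} \, |F - F_\theta| \left| \frac{\partial}{\partial \theta} [\psi_\theta f_\theta] \right| d\mu = 0$
for some $0 < \varepsilon < 1/2$ and $0 < \delta < 1$,

then $\sqrt{n}\,[T[G_n] - T[G]] \xrightarrow{d} N(0, \sigma^2)$. (In
fact, under these conditions we have an almost sure representation of
the estimator, as can be easily seen from an examination of the
proof.)

It should be noted that finiteness of
$\int \left| \frac{\partial F_\theta}{\partial \theta} \psi_\theta f_\theta \right| d\mu$ and
$\int \left| \frac{\partial}{\partial \theta} [\psi_\theta f_\theta] \right| d\mu$
for $\theta$ in a neighborhood of zero is sufficient but
not necessary for iii)-v) to be satisfied. Thus, for the case
of location estimation, a set of sufficient conditions to replace
iii)-v) is

iii*) $\psi_\theta$ and $\left. \frac{\partial \psi_\theta}{\partial \theta} \right|_{\theta = 0}$ are bounded,

iv*) $\int f_\theta^2 \, d\mu < \infty$, and

v*) $\int |f_\theta'| \, d\mu < \infty$.

These conditions are quite reasonable due to robustness
considerations which make bounded weights $\psi$ desirable and the
requirement of a square integrable density with finite Fisher's
information a natural smoothness requirement. Boos (1980) gives
moment-type conditions for the analogue of the above theorem when
$\theta$ is a location parameter.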